Abstract



In randomized trials, researchers are often interested in mediation analysis to understand 
how a treatment works, in particular how much of a treatment's effect is mediated by an 
intermediate variable and how much the treatment affects the outcome directly, not through 
the mediator. The standard regression approach to mediation analysis assumes sequential 
ignorability of the mediator, that is, that the mediator is effectively randomly assigned given 
baseline covariates and the randomized treatment. Since the experiment does not randomize 
the mediator, sequential ignorability is often not plausible. Ten Have et al. (2007, 
Biometrics), Dunn and Bentall (2007, Statistics in Medicine) and Albert (2008, Statistics in 
Medicine) presented methods that use baseline covariates interacted with random assignment 
as instrumental variables, and do not require sequential ignorability. We make two contri- 
butions to this approach. First, in previous work on the instrumental variable approach, 
it has been assumed that the direct effect of treatment and the effect of the mediator are 
constant across subjects; we allow for variation in effects across subjects and show what 
assumptions are needed to obtain consistent estimates for this setting. Second, we develop 
a method of sensitivity analysis for violations of the key assumption that the direct effect of 
the treatment and the effect of the mediator do not depend on the baseline covariates. 

Keywords: Causal Inference, Mediation Analysis, Instrumental Variables. 



1. Introduction 

Randomized trials are explicitly designed to estimate the effects of treatments but not 
how those effects occur. Yet, many researchers are interested in how treatments that are eval- 
uated using randomized experiments achieve their effects. Mediation analysis seeks to open 
up the "black box" of a treatment and explain how it works. For example, the PROSPECT 
study (Bruce et al., 2004) evaluated an intervention for improving treatment of depression 
in the elderly in primary care practices. The intervention consisted of having a depression 
specialist (typically a master's-level clinician) closely collaborate with the depressed patient 
and the patient's primary care physician to facilitate patient and clinician adherence to a 
treatment algorithm and provide education, support and ongoing assessment to the patient. 
The intervention significantly reduced depression (as measured by the Hamilton test) four 
months after baseline. The study's researchers want to know to what extent the effect of 
the intervention can be explained by its increasing the use of prescription antidepressant 
medication, as compared to other factors. Understanding the mechanism by which a treatment 
achieves its effects can help researchers and policymakers design more effective treatments 
(Gennetian, Bos and Morris, 2002; Kraemer et al., 2002). For example, if the PROSPECT 
study intervention achieves its effects primarily through increasing use of antidepressants, 
then a more cost-effective intervention might be designed that has the depression specialist 
focus her time only on increasing use of antidepressants. 

The standard approach to mediation analysis (Judd and Kenny, 1981; Baron and Kenny, 
1986; MacKinnon et al., 2002) makes a strong sequential ignorability assumption that not 
only is the intervention randomly assigned, but the mediating variable (e.g., antidepressant 
use) is also ignorable (i.e., there are no unmeasured confounders of the mediating 
variable-outcome relationship) given the assigned intervention and measured confounding variables 
(Ten Have et al., 2007). In the PROSPECT study, potential unmeasured confounders of the 
mediating variable (antidepressant use)-outcome (depression) relationship include medical 
comorbidities during the follow-up period, which deter elderly depressed patients from taking 
antidepressant medications because of the many other medications that their medical 
comorbidities necessitate, and which also predispose patients to more depression (Ten Have 
et al., 2007). To address such unmeasured confounding, Ten Have et al. (2007) develop an 
alternative approach to mediation analysis that relies on having a baseline covariate that 
interacts with random assignment in predicting the mediating variable, but that does not 
modify the causal effects of the random assignment and the mediating variable. For example, 
for the PROSPECT study, Ten Have et al. considered the baseline covariates baseline de- 
pression and baseline suicide ideation. Ten Have et al.'s approach to mediation analysis uses 
a rank preserving model for causal effects and g-estimation (Robins, 1994). The assumption 
underlying Ten Have et al.'s approach, that there is a baseline covariate that interacts with 
random assignment in predicting the mediating variable but that does not modify the causal 
effects of the random assignment and the mediating variable, can be viewed as an assumption 
that the baseline covariate interacted with random assignment is an instrumental variable 
(IV) for the mediating variable in a structural equation model. Dunn and Bentall (2007) 
show that two stage least squares estimation of this structural equation model 
with the baseline covariate interacted with random assignment as an IV produces essentially 
equivalent results to those of g-estimation of the rank preserving model. Gennetian, Bos and 
Morris (2002), Albert (2008) and Joffe et al. (2008) provide further discussion of this two 
stage least squares approach. 

This paper makes two contributions to the approach of using baseline covariates inter- 
acted with random assignment as IVs for mediation analysis when sequential ignorability 
does not hold. First, in previous work on the instrumental variable approach, it has been 
assumed that the direct effect of treatment and the effect of the mediator are constant across 
subjects; we allow for variation in effects across subjects and show what assumptions are 
needed to obtain consistent estimates for this setting. Second, we develop a method of sen- 
sitivity analysis for violations of the key assumption that the direct effect of the treatment 
and the effect of the mediator do not depend on the baseline covariates. 

Our paper is organized as follows. Section 2 provides the notation and setup. Section 
3 describes the model we will consider. Section 4 reviews the standard regression approach 
to mediation analysis. Section 5 presents the instrumental variables approach. Section 6 
develops a method of sensitivity analysis for the effect of departures from the key assumption 
that the baseline covariate does not modify the causal effects of the random assignment or 
the mediating variable. The methods are applied to the PROSPECT study. 

2. Setup and Notation 

We assume there are N subjects who are an iid sample from a population. We assume 
that the treatment R is randomized. 

The observed variables for subject $i$ are the following: $Y_i$ is the observed outcome, $R_i$ is 
the observed randomized zero-one treatment assignment, $\mathbf{X}_i$ is a vector of observed baseline 
covariates other than treatment assignment and $M_i$ is the observed mediating variable. The 
potential outcomes for subject $i$ are $Y_i^{(r,m)}$, $r = 0$ or $1$ and $m \in \mathcal{M}$, where $\mathcal{M}$ is the set 
of possible values the mediating variable can take on; $Y_i^{(r,m)}$ is the outcome variable that 
would be observed if subject $i$ were randomized to level $r$ of the treatment and through some 
hypothetical mechanism were to receive or exhibit level $m$ of the mediator. To establish a 
unique potential outcome, we assume that all such hypothetical mechanisms lead to the same 
potential outcome (Ten Have et al., 2007). The observed outcome $Y_i$ is equal to $Y_i^{(R_i, M_i)}$. 

The potential mediating variables for subject $i$ are $M_i^{(r)}$, $r = 0$ or $1$; $M_i^{(r)}$ is the 
level of the mediating variable that would be observed if subject $i$ were assigned level $r$ of 
the treatment. The observed mediating variable $M_i$ equals $M_i^{(R_i)}$. 

We let the random variables $Y$, $R$, $\mathbf{X}$, $Y^{(r,m)}$ ($r = 0, 1$, $m \in \mathcal{M}$), $M^{(r)}$ ($r = 0, 1$) be the values 
of the observed outcome, treatment assignment, baseline covariates, potential outcomes and 
potential mediating variables for a randomly chosen subject from the population. 

3. Model 

We consider the following model for potential outcomes: 

$$Y_i^{(r,m)} = Y_i^{(0,0)} + \theta_{M_i} m + \theta_{R_i} r, \tag{1}$$

where the $(Y_i^{(0,0)}, \theta_{M_i}, \theta_{R_i})$ are iid random vectors. Here $\theta_{M_i}$ represents the effect for subject 
$i$ of a one unit increase in the mediator on the outcome holding the treatment fixed at any 
level $r$. The parameter $\theta_{R_i}$ represents the direct effect for subject $i$ of the treatment on the 
outcome holding the mediator fixed at any level $m$. Let $\theta_M = E(\theta_{M_i})$ be the average effect 
of a one unit increase in the mediator and $\theta_R = E(\theta_{R_i})$ be the average direct effect of the 
treatment. 
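
As an illustration of model (1), the following minimal sketch draws subject-specific effects and evaluates potential outcomes at chosen levels of the treatment and mediator. The function name, the normal distributions and all numeric values are assumptions made for illustration only; they are not taken from the PROSPECT study.

```python
import numpy as np

def simulate_potential_outcomes(n, theta_R=-1.0, theta_M=-3.0, seed=0):
    """Draw (Y^(0,0), theta_Mi, theta_Ri) iid; return a function giving Y^(r,m)."""
    rng = np.random.default_rng(seed)
    y00 = rng.normal(15.0, 4.0, size=n)                 # Y_i^(0,0), hypothetical scale
    theta_Mi = theta_M + rng.normal(0.0, 1.0, size=n)   # subject-specific mediator effect
    theta_Ri = theta_R + rng.normal(0.0, 1.0, size=n)   # subject-specific direct effect

    def potential_outcome(r, m):
        # Model (1): Y_i^(r,m) = Y_i^(0,0) + theta_Mi * m + theta_Ri * r
        return y00 + theta_Mi * m + theta_Ri * r

    return potential_outcome, theta_Mi, theta_Ri

potential_outcome, theta_Mi, theta_Ri = simulate_potential_outcomes(10_000)

# Under model (1), these contrasts equal each subject's own effects.
direct = potential_outcome(1, 0) - potential_outcome(0, 0)     # theta_Ri
mediator = potential_outcome(0, 1) - potential_outcome(0, 0)   # theta_Mi
print(direct.mean(), mediator.mean())   # sample analogues of theta_R and theta_M
```

In real data only one potential outcome per subject is observed, so these contrasts are not directly available; the sketch only makes the additive structure of (1) concrete.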

4. Review of Standard Regression Approach 

The standard regression approach of Baron and Kenny (1986) is to estimate $\theta_M$ and $\theta_R$ 
by least squares regression of $Y_i$ on $M_i$ and $R_i$. Under the maintained assumption that $R$ 
is randomized, the standard regression approach provides consistent estimates of $\theta_M$ and $\theta_R$ 
under the additional assumption that $M$ is sequentially ignorable given $R$: 

$$M_i \perp\!\!\!\perp Y_i^{(R_i, m)}, \quad m \in \mathcal{M}, \tag{2}$$

where $\mathcal{M}$ is the set of possible values of the mediating variable $M$. The sequential ignorability 
assumption (2) means that $M$ is effectively randomly assigned given $R$. Under model (1), 
the sequential ignorability assumption (2) is equivalent to 

$$M_i \perp\!\!\!\perp Y_i^{(0,0)}, \theta_{M_i}, \theta_{R_i}. \tag{3}$$

See Imai, Keele and Yamamoto (2010) for further discussion of the sequential ignorability 
assumption. The sequential ignorability assumption (2) will be violated if there are 
confounders of the mediator-outcome relationship. Measured baseline confounders of the 
mediator-outcome relationship can be controlled for by controlling for these confounders in 
the regression. If there are measured postbaseline confounders, the regression on the measured 
confounders will produce an unbiased estimate of $\theta_M$ but not $\theta_R$; to obtain an unbiased 
estimate of $\theta_R$, $Y - \hat{\theta}_M M$ can be regressed on $R$ (Vansteelandt, 2009; Ten Have and Joffe, 2010). 
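
The role of sequential ignorability can be seen in a small simulation. The sketch below is an illustration under assumed values, not an analysis of the PROSPECT data: it uses constant effects, a continuous mediator and a single unmeasured variable `U`, all of which are hypothetical simplifications. Least squares recovers the mediator effect when `U` does not influence the mediator and is biased when it does.

```python
import numpy as np

def ols(y, *columns):
    """Least squares fit of y on an intercept plus the given columns."""
    A = np.column_stack([np.ones(len(y)), *columns])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def simulate(n, confounding, theta_R=-1.0, theta_M=-3.0, seed=0):
    rng = np.random.default_rng(seed)
    R = rng.integers(0, 2, size=n).astype(float)   # randomized treatment
    U = rng.normal(size=n)                         # unmeasured variable affecting Y
    # The mediator depends on U only when `confounding` is nonzero.
    M = 0.5 + 1.0 * R + confounding * U + rng.normal(size=n)
    Y = 10.0 + theta_R * R + theta_M * M + 2.0 * U + rng.normal(size=n)
    return Y, R, M

# Sequential ignorability holds: M is unrelated to U given R.
Y, R, M = simulate(20_000, confounding=0.0)
coef_ignorable = ols(Y, R, M)      # [intercept, theta_R_hat, theta_M_hat]

# Sequential ignorability fails: U drives both M and Y.
Y, R, M = simulate(20_000, confounding=1.0)
coef_confounded = ols(Y, R, M)

print("ignorable :", coef_ignorable[1:])
print("confounded:", coef_confounded[1:])
```

In this assumed design the confounded estimate of the mediator effect converges to the true value plus Cov(U + noise, 2U) / Var(U + noise) = 1, so the bias does not shrink with sample size.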

5. Instrumental Variables Approach 

The standard regression approach can only control for measured confounders of the 
mediator-outcome relationship. The IV approach using baseline covariates interacted with 
treatment assignments can control for unmeasured confounders when baseline covariate(s) 
interacted with treatment assignment are valid IVs. This IV approach for mediation analysis 
models has been discussed by Dunn and Bentall (2007) and Albert (2008), and the 
closely related g-estimation approach has been discussed by Ten Have et al. (2007). These 
authors have considered models in which the direct effect of treatment and the effect of the 
mediating variable are the same for all subjects. We will allow these effects to vary from 
subject to subject as in (1) and provide conditions needed for the instrumental variable 
estimator to be consistent. 

Denote a vector of baseline covariates by $\mathbf{X}$. We assume that the association of $\mathbf{X}$ with 
the potential outcomes is linear: 

$$E(Y^{(0,0)} \mid \mathbf{X}) = \alpha + \beta^T \mathbf{X}. \tag{4}$$

Then, we can write the observed data as 

$$\begin{aligned}
Y_i &= \alpha + \beta^T \mathbf{X}_i + \theta_R R_i + \theta_M M_i + e_i, \\
e_i &= (\theta_{R_i} - \theta_R) R_i + (\theta_{M_i} - \theta_M) M_i + Y_i^{(0,0)} - E(Y_i^{(0,0)} \mid \mathbf{X}_i).
\end{aligned} \tag{5}$$

The least squares regression of $Y$ on $\mathbf{X}$, $R$ and $M$ will produce biased estimates if there are 
unobserved confounders of the mediator-outcome relationship that make $e_i$ correlated with 
$M_i$. The method of instrumental variables (IVs) seeks to replace $M_i$ with its expectation 
given instrumental variables that help to predict $M_i$ and are uncorrelated with $e_i$. The 
interactions between the baseline covariates $\mathbf{X}$ and $R$ are valid IVs if the following conditions 
hold: 

(IV-A1) The interaction between $R$ and $\mathbf{X}$ is helpful for predicting $M$ in a linear model, i.e., 
$E^*(M \mid R, \mathbf{X}) \neq E^*(M \mid R, \mathbf{X}, R\mathbf{X})$, where $E^*(M \mid \mathbf{A}) = \lambda^{*T} \mathbf{A}$, with 
$\lambda^* = \arg\min_{\lambda} E(M - \lambda^T \mathbf{A})^2$, denotes the best linear predictor of $M$ given $\mathbf{A}$. 

(IV-A2) The average direct effect of the treatment given $\mathbf{X}$, $E(\theta_{R_i} \mid \mathbf{X}_i = \mathbf{X})$, is the same for all 
$\mathbf{X}$, i.e., $E(\theta_{R_i} \mid \mathbf{X}_i = \mathbf{X}) = \theta_R$ for all $\mathbf{X}$. Likewise, the average effect of the mediating 
variable given $\mathbf{X}$, $E(\theta_{M_i} \mid \mathbf{X}_i = \mathbf{X})$, is the same for all $\mathbf{X}$, i.e., $E(\theta_{M_i} \mid \mathbf{X}_i = \mathbf{X}) = \theta_M$ for 
all $\mathbf{X}$. 

(IV-A3) The value of the mediating variable is independent of the effect of the mediating variable 
given the treatment and the baseline covariates: 

$$M_i \perp\!\!\!\perp \theta_{M_i} \mid R_i, \mathbf{X}_i. \tag{6}$$

(IV-A1) says that $R\mathbf{X}$ helps to predict $M$. (IV-A2) and (IV-A3), and the assumption that 
$R$ is randomly assigned, together guarantee that $R\mathbf{X}$ is uncorrelated with $e$, which we show 
in the following. 

Proposition 1: Under (IV-A2) and (IV-A3) and the assumption that $R$ is randomly 
assigned, each component of $R_i \times \mathbf{X}_i$ is uncorrelated with $e_i$. 

Proof: Consider a component of $R_i \times \mathbf{X}_i$, $R_i X_{i1}$. From (5), 
$e_i = (\theta_{R_i} - \theta_R) R_i + (\theta_{M_i} - \theta_M) M_i + \{Y_i^{(0,0)} - E(Y_i^{(0,0)} \mid \mathbf{X}_i)\}$. 
We will prove that $\mathrm{Cov}(R_i X_{i1}, e_i) = 0$ by showing that $R_i X_{i1}$ 
is uncorrelated with each of the three summands that make up $e_i$, namely 
(i) $\mathrm{Cov}(R_i X_{i1}, (\theta_{R_i} - \theta_R) R_i) = 0$; 
(ii) $\mathrm{Cov}(R_i X_{i1}, (\theta_{M_i} - \theta_M) M_i) = 0$; and 
(iii) $\mathrm{Cov}(R_i X_{i1}, Y_i^{(0,0)} - E(Y_i^{(0,0)} \mid \mathbf{X}_i)) = 0$. 
For (i), since $R_i$ is randomized, we have $E[(\theta_{R_i} - \theta_R) R_i] = 0$ so that 
$\mathrm{Cov}(R_i X_{i1}, (\theta_{R_i} - \theta_R) R_i) = E(R_i X_{i1} (\theta_{R_i} - \theta_R) R_i)$. Furthermore, we have 

$$\begin{aligned}
E(R_i X_{i1} (\theta_{R_i} - \theta_R) R_i) &= E(R_i) E(X_{i1} (\theta_{R_i} - \theta_R)) \\
&= 0,
\end{aligned}$$

where the first equality follows from the fact that $R$ is randomized and the second equality 
follows from (IV-A2). This proves (i). For (ii), we first note that 

$$\begin{aligned}
E[(\theta_{M_i} - \theta_M) M_i] &= E[E[(\theta_{M_i} - \theta_M) M_i \mid R_i, \mathbf{X}_i]] \\
&= E[E[(\theta_{M_i} - \theta_M) \mid R_i, \mathbf{X}_i] \, E[M_i \mid R_i, \mathbf{X}_i]] \\
&= 0,
\end{aligned}$$

where the second equality follows from (IV-A3) and the third equality follows from (IV-A2) 
and the fact that $R$ is randomized. Thus, 
$\mathrm{Cov}(R_i X_{i1}, (\theta_{M_i} - \theta_M) M_i) = E(R_i X_{i1} (\theta_{M_i} - \theta_M) M_i)$, and 

$$\begin{aligned}
E(R_i X_{i1} (\theta_{M_i} - \theta_M) M_i) &= E[E[R_i X_{i1} (\theta_{M_i} - \theta_M) M_i \mid R_i, \mathbf{X}_i]] \\
&= E[R_i X_{i1} E[(\theta_{M_i} - \theta_M) M_i \mid R_i, \mathbf{X}_i]] \\
&= E[R_i X_{i1} E[(\theta_{M_i} - \theta_M) \mid R_i, \mathbf{X}_i] \, E[M_i \mid R_i, \mathbf{X}_i]] \\
&= 0,
\end{aligned}$$

where the third equality follows from (IV-A3) and the fourth equality follows from (IV-A2) 
and the fact that $R$ is randomized. This proves (ii). For (iii), 

$$\begin{aligned}
\mathrm{Cov}(R_i X_{i1}, Y_i^{(0,0)} - E[Y_i^{(0,0)} \mid \mathbf{X}_i]) &= E[R_i X_{i1} \{Y_i^{(0,0)} - E[Y_i^{(0,0)} \mid \mathbf{X}_i]\}] \\
&= E(R_i) E[X_{i1} \{Y_i^{(0,0)} - E[Y_i^{(0,0)} \mid \mathbf{X}_i]\}] \\
&= 0,
\end{aligned}$$

where the second equality follows from $R$ being randomized and the third equality from properties 
of conditional expectation. This proves (iii). □ 

Assumption (IV-A3) is weaker than the sequential ignorability assumption (2) because 
(IV-A3) does not say that $Y_i^{(0,0)}$ is independent of $M_i$. Assumption (IV-A3) says that 
the level of the mediating variable is independent of the effect the mediating variable has, 
while sequential ignorability says that not only is the level independent of the effect, but 
also the level is independent of all the person's potential outcomes. In the context of the 
PROSPECT study, (IV-A3) says that antidepressant use is independent of the effect that the 
antidepressant would have, while sequential ignorability says that not only is antidepressant 
use independent of its effect, but antidepressant use is also independent of unmeasured 
medical comorbidities and any other unmeasured variables that affect depression. Note that 
(IV-A3) is automatically satisfied if $\theta_{R_i}$ and $\theta_{M_i}$ are the same for all subjects 
as is assumed by Ten Have et al. (2007), Dunn and Bentall (2007) and Albert (2008). 

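Under (IV-A2)-(IV-A3), we have 

$$\begin{aligned}
E^*(Y \mid R, \mathbf{X}, R \times \mathbf{X}) &= \alpha + \beta^T \mathbf{X} + \theta_R R + \theta_M E^*(M \mid R, \mathbf{X}, R \times \mathbf{X}) + E^*(e \mid R, \mathbf{X}, R \times \mathbf{X}) \\
&= \alpha + \beta^T \mathbf{X} + \theta_R R + \theta_M E^*(M \mid R, \mathbf{X}, R \times \mathbf{X}).
\end{aligned}$$

The two stage least squares estimates of $\theta_R$ and $\theta_M$ are found as follows: 

1. Regress $M$ on $R$, $\mathbf{X}$ and $R \times \mathbf{X}$ using least squares and obtain the predicted values 
$\hat{E}^*(M \mid R, \mathbf{X}, R \times \mathbf{X})$. 

2. Regress $Y$ on $R$, $\mathbf{X}$ and $\hat{E}^*(M \mid R, \mathbf{X}, R \times \mathbf{X})$ using least squares. The coefficient on $R$ is 
$\hat{\theta}_R$ and the coefficient on $\hat{E}^*(M \mid R, \mathbf{X}, R \times \mathbf{X})$ is $\hat{\theta}_M$. 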

Using the theory of instrumental variables for single-equation linear models (Wooldridge, 
2002, Ch. 5), the two stage least squares estimates are consistent under (IV-A1)-(IV-A3) 
because (i) $\mathrm{Cov}(R \times \mathbf{X}, e) = 0$ under (IV-A2)-(IV-A3) and (ii) the coefficient on $R \times \mathbf{X}$ in 
the linear projection of $M$ onto $R$, $\mathbf{X}$ and $R \times \mathbf{X}$ is not 0 under (IV-A1). 
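
The two-stage procedure can be sketched on simulated data as follows. This is a minimal illustration under assumed values rather than the analysis reported in this paper: the data-generating process (a scalar covariate `X`, a continuous mediator, an unmeasured confounder `U`, and heterogeneous effects drawn independently of everything else so that (IV-A2) and (IV-A3) hold) and all function names are hypothetical.

```python
import numpy as np

def lstsq_fit(A, y):
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def two_stage_least_squares(Y, R, X, M):
    """2SLS for (theta_R, theta_M) using R*X as the instrument for M (scalar X)."""
    ones = np.ones(len(Y))
    # Stage 1: regress M on (1, R, X, R*X) and keep the fitted values.
    Z = np.column_stack([ones, R, X, R * X])
    M_hat = Z @ lstsq_fit(Z, M)
    # Stage 2: regress Y on (1, X, R, M_hat).
    A_hat = np.column_stack([ones, X, R, M_hat])
    alpha, beta, theta_R, theta_M = lstsq_fit(A_hat, Y)
    return theta_R, theta_M

rng = np.random.default_rng(1)
n = 50_000
R = rng.integers(0, 2, size=n).astype(float)     # randomized treatment
X = rng.normal(size=n)                           # baseline covariate
U = rng.normal(size=n)                           # unmeasured confounder of M and Y
theta_Ri = -1.0 + 0.5 * rng.normal(size=n)       # heterogeneous direct effect
theta_Mi = -3.0 + 0.5 * rng.normal(size=n)       # heterogeneous mediator effect
M = 0.5 * R + X + 1.5 * R * X + U + rng.normal(size=n)   # R*X predicts M (IV-A1)
Y = 10.0 + X + theta_Ri * R + theta_Mi * M + 2.0 * U + rng.normal(size=n)

theta_R_hat, theta_M_hat = two_stage_least_squares(Y, R, X, M)

# Ordinary least squares of Y on (1, X, R, M) for comparison.
ols_coef = lstsq_fit(np.column_stack([np.ones(n), X, R, M]), Y)
ols_theta_M = ols_coef[3]

print("2SLS:", theta_R_hat, theta_M_hat)
print("OLS mediator effect:", ols_theta_M)
```

In this assumed design the two stage least squares estimates should be close to the average effects (-1 and -3), while the least squares estimate of the mediator effect is pulled away from -3 by the unmeasured confounder.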

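We now discuss the variance-covariance matrix of $\hat{\kappa} = (\hat{\alpha}, \hat{\beta}, \hat{\theta}_R, \hat{\theta}_M)$. First, consider the 
following additional assumptions: 

(AA-1) The distribution of the direct effect of the treatment and the effect of the mediating 
variable do not depend on $\mathbf{X}_i$, 

$$\theta_{R_i}, \theta_{M_i} \perp\!\!\!\perp \mathbf{X}_i.$$

(AA-2) $\mathrm{Var}(\{Y_i^{(0,0)} - E(Y_i^{(0,0)})\} \mid \mathbf{X}_i = \mathbf{X})$ is the same for all $\mathbf{X}$. 

Under (AA-1)-(AA-2), $\mathrm{Var}(e_i \mid R_i, \mathbf{X}_i)$ is the same for all $R_i, \mathbf{X}_i$. Then a consistent 
estimate of the variance-covariance matrix of $\hat{\kappa}$ is $\hat{\sigma}_e^2 (\hat{A}^T \hat{A})^{-1}$, where 
$\hat{\sigma}_e^2 = \frac{1}{N} \sum_{i=1}^{N} \hat{e}_i^2$, 
$\hat{e}_i = Y_i - \hat{\alpha} - \hat{\beta}^T \mathbf{X}_i - \hat{\theta}_R R_i - \hat{\theta}_M M_i$ and $\hat{A}$ is a matrix with $N$ rows consisting of a column of 
ones, columns for each of the variables in $\mathbf{X}$ for the $N$ subjects, a column of the values of 
$R$ for the $N$ subjects and a column of the values of $\hat{E}^*(M \mid R, \mathbf{X}, R \times \mathbf{X})$ for the $N$ subjects 
(Wooldridge, 2002, Ch. 5). By a consistent estimate of the covariance matrix, we mean 
that $\widehat{\mathrm{Cov}}(\sqrt{N} \hat{\kappa}_N)$ is a consistent estimator of $\mathrm{Cov}(\sqrt{N} \hat{\kappa}_N)$, where $\hat{\kappa}_N$ is the two stage least 
squares estimator of $\kappa$ based on $N$ observations. 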

Suppose that (a) the $Y_i^{(0,0)} - E(Y_i^{(0,0)} \mid \mathbf{X}_i)$ have a distribution that depends on $\mathbf{X}_i$ 
and/or (b) the direct effects of treatment and the effect of the mediating variable have a 
distribution that might depend on $\mathbf{X}$ but the mean is the same for all $\mathbf{X}$, i.e., $E(\theta_{R_i} \mid \mathbf{X}_i) = \theta_R$ 
and $E(\theta_{M_i} \mid \mathbf{X}_i) = \theta_M$. Then, the two stage least squares estimate remains consistent, but the 
usual standard error might be inconsistent. A consistent estimate of the covariance matrix 
under regularity conditions (White, 1982; Wooldridge, 2002, Ch. 5.2.5) is the "sandwich" 
estimator, $(\hat{A}^T \hat{A})^{-1} \left( \sum_{i=1}^{N} \hat{e}_i^2 \hat{A}_i \hat{A}_i^T \right) (\hat{A}^T \hat{A})^{-1}$, where $\hat{A}_i = (1, \mathbf{X}_i^T, R_i, \hat{M}_i)^T$ is the $i$th row of $\hat{A}$ and $\hat{M}_i$ is the first stage predicted value of $M_i$. 
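
Both covariance estimates can be computed in a few lines once the second stage design matrix is available. The sketch below is illustrative only: the simulated data, the scalar covariate and the function name are assumptions, and the residuals are computed with the observed mediator, as in the definition of the residuals given above.

```python
import numpy as np

def tsls_with_covariance(Y, R, X, M):
    """2SLS estimate of (alpha, beta, theta_R, theta_M) with two covariance estimates."""
    n = len(Y)
    ones = np.ones(n)
    Z = np.column_stack([ones, R, X, R * X])              # first stage regressors
    M_hat = Z @ np.linalg.lstsq(Z, M, rcond=None)[0]
    A_hat = np.column_stack([ones, X, R, M_hat])          # rows are A_hat_i
    kappa_hat = np.linalg.lstsq(A_hat, Y, rcond=None)[0]
    # Residuals use the observed mediator M, not the first stage fit.
    e_hat = Y - np.column_stack([ones, X, R, M]) @ kappa_hat
    bread = np.linalg.inv(A_hat.T @ A_hat)
    cov_homoskedastic = np.mean(e_hat**2) * bread         # sigma_hat^2 (A'A)^{-1}
    meat = (A_hat * e_hat[:, None]**2).T @ A_hat          # sum_i e_i^2 A_i A_i'
    cov_sandwich = bread @ meat @ bread
    return kappa_hat, cov_homoskedastic, cov_sandwich

# Hypothetical data with heterogeneous effects, which makes the errors heteroskedastic.
rng = np.random.default_rng(4)
n = 5_000
R = rng.integers(0, 2, size=n).astype(float)
X = rng.normal(size=n)
U = rng.normal(size=n)
theta_Ri = -1.0 + 0.5 * rng.normal(size=n)
theta_Mi = -3.0 + 0.5 * rng.normal(size=n)
M = 0.5 * R + X + 1.5 * R * X + U + rng.normal(size=n)
Y = 10.0 + X + theta_Ri * R + theta_Mi * M + 2.0 * U + rng.normal(size=n)

kappa_hat, cov_h, cov_s = tsls_with_covariance(Y, R, X, M)
se_h = np.sqrt(np.diag(cov_h))
se_s = np.sqrt(np.diag(cov_s))
print("homoskedastic SEs:", se_h)
print("sandwich SEs     :", se_s)
```

Comparing the two sets of standard errors, as is done for the PROSPECT analysis in Section 5.1, gives a rough indication of how much heteroskedasticity matters for inference.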

Inferences from two stage least squares become unreliable if the IV(s) are "weak," which 
in our setting means that the interaction between $R$ and $\mathbf{X}$ is only a weak predictor of $M$ in 
the linear model $E^*(M \mid R, \mathbf{X}, R \times \mathbf{X})$. Specifically, when the IV(s) are weak, the two stage 
least squares estimates can have a large bias in the direction of the ordinary least squares 
estimates of $Y$ on $\mathbf{X}$, $R$ and $M$, and the coverage of the confidence intervals for the two 
stage least squares estimates can be poor (Bound, Jaeger and Baker, 1995). Stock, Wright 
and Yogo (2002) provided a criterion for when IV inference is reliable based on the partial 
F statistic for testing that the coefficients on the $R \times \mathbf{X}$ variables are zero in the first stage 
regression of $M$ on $R$, $\mathbf{X}$ and $R \times \mathbf{X}$. Inference can be expected to be reliable when this F 
statistic is greater than 8.96, 11.59, 12.83, 15.09, 20.88 and 26.80 for 1, 2, 3, 5, 10 and 15 
variables in $\mathbf{X}$ respectively. This criterion is based on the goal that a nominal 0.05 level 
test of the coefficient on $M$ has actual level at most 0.15, with the chance of falsely 
concluding that the actual level is at most 0.15 being at most 0.05. 
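
The partial F statistic compares the first stage regression with and without the interaction terms. The following sketch computes it from residual sums of squares; the simulated data and the function names are hypothetical, and the comparison value 8.96 is the threshold quoted above for a single variable in X.

```python
import numpy as np

def rss(A, y):
    """Residual sum of squares from the least squares fit of y on A."""
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return float(resid @ resid)

def first_stage_partial_F(M, R, X):
    """Partial F for the R*X terms in the regression of M on (1, R, X, R*X)."""
    X = np.asarray(X, dtype=float).reshape(len(M), -1)    # n x p covariate matrix
    n, p = X.shape
    restricted = np.column_stack([np.ones(n), R, X])      # without interactions
    full = np.column_stack([restricted, R[:, None] * X])  # with interactions
    rss_r, rss_f = rss(restricted, M), rss(full, M)
    return ((rss_r - rss_f) / p) / (rss_f / (n - full.shape[1]))

rng = np.random.default_rng(2)
n = 2_000
R = rng.integers(0, 2, size=n).astype(float)
X = rng.normal(size=n)
noise = rng.normal(scale=np.sqrt(2.0), size=n)
M_strong = 0.5 * R + X + 1.5 * R * X + noise   # interaction predicts the mediator
M_null = 0.5 * R + X + noise                   # no interaction: an irrelevant instrument

F_strong = first_stage_partial_F(M_strong, R, X)
F_null = first_stage_partial_F(M_null, R, X)
print("strong instrument F:", F_strong, "exceeds 8.96:", F_strong > 8.96)
print("irrelevant instrument F:", F_null)
```

With several covariates, `X` can be passed as an n-by-p array and the statistic tests all p interaction coefficients jointly.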

5.1 Application to PROSPECT study 

We use the PROSPECT study data set provided by Ten Have et al. (2007) under the 
Article Information link at the Biometrics website http://www.tibs.org/biometrics/. There 
are 297 subjects; 145 were randomized to the intervention and 152 to the control. The outcome 
is the subject's Hamilton score (a measure of depression, with a higher score indicating 
more depression) four months after the intervention. The mediating variable is an indicator 
for whether the subject used antidepressants during the period from the intervention to 
four months after the intervention. The intervention significantly increases the mediator: 
the intervention is estimated to multiply the odds of antidepressant use by 6.7 with a 95% 
confidence interval of (3.9, 11.7). 

The second row of Table 1 shows estimates from the standard regression approach. Baseline 
covariates are baseline Hamilton score, a baseline indicator of whether the subject had 
suicide ideation, the site at which the subject was treated (Cornell, University of Pennsylvania 
or University of Pittsburgh), an indicator of whether the subject has used antidepressants 
in the past and a baseline ordinal measure of antidepressant use. The intervention is estimated 
to have a direct effect of reducing depression and antidepressant use is estimated 
to reduce depression, with the direct effect being significant but the mediator effect not 
significant. 

Following Ten Have et al. (2007), we consider as instrumental variables the interaction 
between the randomized intervention and two of the baseline covariates, (i) indicator of 
whether the subject has used antidepressants in the past and (ii) baseline ordinal measure of 
antidepressant use. The partial F statistic for the instruments in the first stage regression is 
27.13, indicating that these are not weak instruments. The two stage least squares estimates 
are shown in the third row of Table 1. The confidence intervals are based on the assumption 
that the errors $e_i$ are homoskedastic, but the confidence intervals are similar if we use the sandwich 
covariance estimates that allow for heteroskedasticity. 



Method                 Direct effect of intervention    Mediator effect 
Standard Regression    -2.74 (-4.54, -0.94)             -1.17 (-3.31, 0.97) 
IV                     -0.94 (-3.92, 2.04)              -2.87 (-8.89, 3.15) 

Table 1: Estimates for the direct effect of the intervention and the mediator (antidepressant 
use) effect in the PROSPECT study. 95% confidence intervals are in parentheses. 

6. Sensitivity Analysis 

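In this section, we will consider the sensitivity of inferences to violations of assumption 
(IV-A2) that the average direct effect of the treatment given $\mathbf{X}$ and the average effect of the 
mediating variable given $\mathbf{X}$ are the same for all $\mathbf{X}$. Consider the following parametric family 
of violations of assumption (IV-A2): 

$$\begin{aligned}
E[\theta_{R_i} \mid \mathbf{X}_i = \mathbf{X}] &= \theta_R + \tau_R^T (\mathbf{X} - E[\mathbf{X}]), \\
E[\theta_{M_i} \mid \mathbf{X}_i = \mathbf{X}] &= \theta_M + \tau_M^T (\mathbf{X} - E[\mathbf{X}]).
\end{aligned} \tag{7}$$

(IV-A2) is satisfied if $\tau_R = 0$ and $\tau_M = 0$. Suppose we know the values of $\tau_R$, $\tau_M$ and 
$E[\mathbf{X}]$. Then, we can write 

$$\begin{aligned}
Y_i - R_i \tau_R^T (\mathbf{X}_i - E[\mathbf{X}]) - M_i \tau_M^T (\mathbf{X}_i - E[\mathbf{X}]) &= \alpha + \beta^T \mathbf{X}_i + \theta_R R_i + \theta_M M_i + e_i, \\
e_i &= (\theta_{R_i} - E(\theta_{R_i} \mid \mathbf{X}_i)) R_i + (\theta_{M_i} - E(\theta_{M_i} \mid \mathbf{X}_i)) M_i + Y_i^{(0,0)} - E(Y_i^{(0,0)} \mid \mathbf{X}_i).
\end{aligned} \tag{8}$$

Now, we show that $R_i \times \mathbf{X}_i$ are valid IVs for estimating $\theta_R$ and $\theta_M$ when the response 
variable is $Y_i - R_i \tau_R^T (\mathbf{X}_i - E[\mathbf{X}]) - M_i \tau_M^T (\mathbf{X}_i - E[\mathbf{X}])$. 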

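Proposition 2: Under (7), (IV-A3) and the assumption that $R$ is randomly assigned, 
each component of $R_i \times \mathbf{X}_i$ is uncorrelated with $e_i$. 

Proof: Consider a component of $R_i \times \mathbf{X}_i$, $R_i X_{i1}$. From (8), 
$e_i = (\theta_{R_i} - E(\theta_{R_i} \mid \mathbf{X}_i)) R_i + (\theta_{M_i} - E(\theta_{M_i} \mid \mathbf{X}_i)) M_i + \{Y_i^{(0,0)} - E(Y_i^{(0,0)} \mid \mathbf{X}_i)\}$. 
We will prove that $\mathrm{Cov}(R_i X_{i1}, e_i) = 0$ by 
showing that $R_i X_{i1}$ is uncorrelated with each of the three summands that make up $e_i$, namely 
(i) $\mathrm{Cov}(R_i X_{i1}, (\theta_{R_i} - E(\theta_{R_i} \mid \mathbf{X}_i)) R_i) = 0$; 
(ii) $\mathrm{Cov}(R_i X_{i1}, (\theta_{M_i} - E(\theta_{M_i} \mid \mathbf{X}_i)) M_i) = 0$; and 
(iii) $\mathrm{Cov}(R_i X_{i1}, Y_i^{(0,0)} - E(Y_i^{(0,0)} \mid \mathbf{X}_i)) = 0$. 
For (i), since $R_i$ is randomized, we have 
$E[(\theta_{R_i} - E(\theta_{R_i} \mid \mathbf{X}_i)) R_i] = 0$ so that 
$\mathrm{Cov}(R_i X_{i1}, (\theta_{R_i} - E(\theta_{R_i} \mid \mathbf{X}_i)) R_i) = E(R_i X_{i1} (\theta_{R_i} - E(\theta_{R_i} \mid \mathbf{X}_i)) R_i)$. 
Furthermore, we have 

$$\begin{aligned}
E(R_i X_{i1} (\theta_{R_i} - E(\theta_{R_i} \mid \mathbf{X}_i)) R_i) &= E(R_i) E(X_{i1} (\theta_{R_i} - E(\theta_{R_i} \mid \mathbf{X}_i))) \\
&= 0,
\end{aligned}$$

where the first equality follows from the fact that $R$ is randomized and the second equality 
follows from properties of conditional expectation. This proves (i). For (ii), we first note 
that 

$$\begin{aligned}
E[(\theta_{M_i} - E(\theta_{M_i} \mid \mathbf{X}_i)) M_i] &= E[E[(\theta_{M_i} - E(\theta_{M_i} \mid \mathbf{X}_i)) M_i \mid R_i, \mathbf{X}_i]] \\
&= E[E[\theta_{M_i} - E(\theta_{M_i} \mid \mathbf{X}_i) \mid R_i, \mathbf{X}_i] \, E[M_i \mid R_i, \mathbf{X}_i]] \\
&= 0,
\end{aligned}$$

where the second equality follows from (IV-A3) and the third equality follows from the fact 
that $R$ is randomized and properties of conditional expectation. Thus, 
$\mathrm{Cov}(R_i X_{i1}, (\theta_{M_i} - E(\theta_{M_i} \mid \mathbf{X}_i)) M_i) = E(R_i X_{i1} (\theta_{M_i} - E(\theta_{M_i} \mid \mathbf{X}_i)) M_i)$, and 

$$\begin{aligned}
E(R_i X_{i1} (\theta_{M_i} - E(\theta_{M_i} \mid \mathbf{X}_i)) M_i) &= E[E[R_i X_{i1} (\theta_{M_i} - E(\theta_{M_i} \mid \mathbf{X}_i)) M_i \mid R_i, \mathbf{X}_i]] \\
&= E[R_i X_{i1} E[(\theta_{M_i} - E(\theta_{M_i} \mid \mathbf{X}_i)) M_i \mid R_i, \mathbf{X}_i]] \\
&= E[R_i X_{i1} E[\theta_{M_i} - E(\theta_{M_i} \mid \mathbf{X}_i) \mid R_i, \mathbf{X}_i] \, E[M_i \mid R_i, \mathbf{X}_i]] \\
&= 0,
\end{aligned}$$

where the third equality follows from (IV-A3) and the fourth equality follows from the fact 
that $R$ is randomized and properties of conditional expectation. This proves (ii). For (iii), 

$$\begin{aligned}
\mathrm{Cov}(R_i X_{i1}, Y_i^{(0,0)} - E[Y_i^{(0,0)} \mid \mathbf{X}_i]) &= E[R_i X_{i1} \{Y_i^{(0,0)} - E[Y_i^{(0,0)} \mid \mathbf{X}_i]\}] \\
&= E(R_i) E[X_{i1} \{Y_i^{(0,0)} - E[Y_i^{(0,0)} \mid \mathbf{X}_i]\}] \\
&= 0,
\end{aligned}$$

where the second equality follows from $R$ being randomized. This proves (iii). □ 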

Based on Proposition 2, we can make inferences for $\theta_R$ and $\theta_M$ under (IV-A1), (IV-A3) 
and (7) by replacing $Y_i$ by $Y_i - R_i \tau_R^T (\mathbf{X}_i - E[\mathbf{X}]) - M_i \tau_M^T (\mathbf{X}_i - E[\mathbf{X}])$ in the two stage least 
squares inference procedure from Section 5. Table 2 shows the results of a sensitivity analysis. 
The first component of $\tau_R$ and $\tau_M$ corresponds to past antidepressant use and the second 
component corresponds to baseline antidepressant use. 
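
The adjustment can be sketched as follows: for each assumed value of the sensitivity parameters, subtract the corresponding terms from the outcome and rerun two stage least squares. This is a minimal illustration on simulated data with a scalar covariate, not the PROSPECT sensitivity analysis; the function names, the data-generating values, and the use of the sample mean of X in place of E[X] are assumptions.

```python
import numpy as np

def tsls(Y, R, X, M):
    """2SLS of Y on (1, X, R, M) with R*X as the instrument for M (scalar X)."""
    ones = np.ones(len(Y))
    Z = np.column_stack([ones, R, X, R * X])
    M_hat = Z @ np.linalg.lstsq(Z, M, rcond=None)[0]
    A_hat = np.column_stack([ones, X, R, M_hat])
    coef = np.linalg.lstsq(A_hat, Y, rcond=None)[0]
    return coef[2], coef[3]                      # (theta_R_hat, theta_M_hat)

def sensitivity_adjusted_tsls(Y, R, X, M, tau_R, tau_M):
    Xc = X - X.mean()                            # sample mean stands in for E[X]
    Y_adj = Y - R * tau_R * Xc - M * tau_M * Xc  # adjusted response, as in (8)
    return tsls(Y_adj, R, X, M)

# Hypothetical data in which (IV-A2) fails with tau_R = tau_M = 0.5 and E[X] = 0.
rng = np.random.default_rng(3)
n = 50_000
tau_R_true, tau_M_true = 0.5, 0.5
R = rng.integers(0, 2, size=n).astype(float)
X = rng.normal(size=n)
U = rng.normal(size=n)
theta_Ri = -1.0 + tau_R_true * X + 0.5 * rng.normal(size=n)
theta_Mi = -3.0 + tau_M_true * X + 0.5 * rng.normal(size=n)
M = 0.5 * R + X + 1.5 * R * X + U + rng.normal(size=n)
Y = 10.0 + X + theta_Ri * R + theta_Mi * M + 2.0 * U + rng.normal(size=n)

# A small grid of sensitivity parameter values, including the true ones.
for tau_R, tau_M in [(0.0, 0.0), (0.5, 0.0), (0.0, 0.5), (0.5, 0.5)]:
    print((tau_R, tau_M), sensitivity_adjusted_tsls(Y, R, X, M, tau_R, tau_M))

adjusted = sensitivity_adjusted_tsls(Y, R, X, M, tau_R_true, tau_M_true)
```

In practice the true sensitivity parameters are unknown, so the estimates are reported over a range of plausible values, as in a sensitivity table.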

Dedication. This paper is dedicated to my friend and mentor Tom Ten Have. Tom 
provided a lot of insightful suggestions in the early stage of this work, and unfortunately 
passed away before I could discuss the later stages of the work with him. 